Real Party in Interest

The real party in interest comprises Microsoft Corporation, by way of 
assignment from Zuberec et al. who are the named inventors and are captioned in 
the present brief. The assignment document was recorded at Reel/Frame 
9609/0977 in the United States Patent and Trademark Office on 11/12/1998.

Related Appeals and Interferences

None. 

Status of the Claims 

Claims 1, 3-9, 11-18, 20-35, 37 and 39 are pending. 

Status of Amendments 

No amendment has been filed subsequent to the final rejection. 

Summary 

A speech recognition system (SR) 20 of Fig. 1 is disclosed that includes an 
application 22, a vocabulary 24 accessible by the application 22 that holds a set of 
utterances applicable to the application, and a speech recognition engine 28 to 
recognize the utterances in the vocabulary 24. The speech recognition engine 28 
is configured to actively listen and recognize an utterance for a predetermined 
response time. If the speech engine 28 does not recognize an utterance within the 
predetermined amount of time, the speech recognition engine enters a dormant 
state, and remains in the dormant state until a starter word from the vocabulary 24 
is recognized. 
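
The listening behavior summarized above can be illustrated with a minimal sketch. It is not taken from the specification: the class name, the method name, the returned status strings, and the use of caller-supplied timestamps are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ListeningEngine:
    """Illustrative model of the active/dormant behavior (hypothetical names)."""
    vocabulary: set
    starter_word: str
    response_time: float      # predetermined response time, in seconds
    dormant: bool = True      # assumption: the engine starts out dormant
    deadline: float = 0.0     # time at which the current response period ends

    def hear(self, word: str, now: float) -> str:
        # Dormant: only the starter word wakes the engine.
        if self.dormant:
            if word == self.starter_word:
                self.dormant = False
                self.deadline = now + self.response_time
                return "awake"
            return "ignored"
        # Active, but the response time lapsed with nothing recognized:
        # enter the dormant state and treat the word as heard while dormant.
        if now > self.deadline:
            self.dormant = True
            return self.hear(word, now)
        if word in self.vocabulary:
            # A recognized utterance restarts the response period.
            self.deadline = now + self.response_time
            return "recognized"
        return "unrecognized"
```

In this sketch an unrecognized word does not extend the response period, so repeated unrecognized input still ends in the dormant state.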

The application 22 includes a user interface 30 to provide both visual and 
auditory speech recognition engine 28 feedback to guide a user in a casual, eyes- 
off environment such as in a motor vehicle (see, the in-dash accessory 50 of Fig. 2, 
and the specification, page 9, line 14 through page 12, line 3). Specifically, the 
user interface 30 provides feedback to inform the user when the speech 
recognition engine 28 is awaiting vocal input and to indicate whether the speech
recognition engine 28 recognizes an utterance from the vocabulary 24. 

For instance, the user interface 30 includes an audio sound or speech 
generator 158 of Fig. 4 that produces three distinct sounds: an SR "on" sound
signifying that the system is on and actively awaiting vocal input; an "off" sound
indicating that the SR system is off and in a sleep mode; and a "confirm" sound 
noting that an utterance has been recognized. The "on" sound is triggered by a
key "wake up" command or by depression of a button (see, the specification, page 8,
lines 3-20). Once awake, the speech recognition engine 28 expects to receive an 
utterance from the active vocabulary 26, which is part of the vocabulary 24, within 
a predetermined response time. The "confirm" sound signals the start of the 
response time. If the response time lapses before a recognizable utterance is 
entered, the "off" sound is played.

The user interface 30 further includes a visual component in the form of a 
graphic that changes with the tolling of the response period. In one 
implementation, the count graphic 310 of Figs. 7a through 7c is a progress bar that 
counts down or shortens in proportion to the diminishment of the response period. 
When the response time runs out, the progress bar disappears entirely. On the 
other hand, if the speech engine 28 recognizes an utterance within the response 
period, the user interface 30 plays the "confirm" sound and restarts the countdown 
graphic 310. The user interface 30 may also temporarily change the color of the 
graphic 310 elements from one color to another and then back to the original color 
to reflect a correct voice entry. 
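
The proportional shortening of the progress bar can be sketched as follows. This is only an illustration under stated assumptions: the function name, the text rendering, and the default width are hypothetical and do not come from the specification or Figs. 7a through 7c.

```python
def countdown_bar(elapsed: float, response_time: float, width: int = 20) -> str:
    """Return a text bar whose length is proportional to the time remaining."""
    # Clamp at zero so the bar disappears entirely once the time runs out.
    remaining = max(0.0, response_time - elapsed)
    filled = round(width * remaining / response_time)
    return "#" * filled
```

Restarting the countdown after a recognized utterance corresponds to calling the function again with `elapsed` reset to zero, which yields the full-width bar.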

The user interface 30 may also be used in distributed collaboration efforts
to facilitate conversations between remotely located people (see, the specification, 
page 14, line 17 through page 15, line 3). The visual display of Figs. 7a through
7c tells a user when they can speak and how long they can speak before their turn 
"times out." 

Issues Presented 

1. Whether claims 1, 3, 6, 9, 13, 14, 16-18, 21-28, 31, 33-34, and 39 are 
properly rejected under 35 USC §103 as being unpatentable over U.S. 
Patent No. 6,018,711 to French St. George et al. (hereafter referred to as
"St. George") in view of U.S. Patent No. 5,774,841 to Salazar et al. 
(hereinafter referred to as "Salazar"). 

2. Whether claims 4-5, 7-8, 11-12, 15, 20, 29-30, 32, 35, and 37 are 
properly rejected under 35 USC §103 as being unpatentable over St. 
George in view of Salazar as applied to claims 1, 9, 18, 23, 27, and 34,
and further in view of U.S. Patent No. 6,075,534 to VanBuskirk et al. 
(hereinafter referred to as "VanBuskirk"). Appellant respectfully 
traverses this rejection. 

Grouping of Claims 

The six (6) groups of pending claims listed below respectively stand or fall
together:

1. Claims 1, 3, 5-6, 9, 12-14, 16-18, 20-26 stand or fall together. 

2. Claims 4 and 11 stand or fall together.

3. Claims 7, 8, and 15 stand or fall together.

4. Claims 27-32 stand or fall together. 

5. Claim 33 stands or falls by itself.

6. Claims 34-39 stand or fall together. 

Argument 

Claims 1, 3, 6, 9, 13, 14, 16-18, 21-28, 31, 33-34, and 39 are Not
Obvious Over St. George in View of Salazar in Conformity
with 35 U.S.C. §103(a).

Claims 1, 3-9, 11-18, 20-35, 37 and 39 stand rejected under 35 USC §103
as being unpatentable over St. George in view of Salazar. The appellant traverses 
this rejection. 

Claim 1 recites "a speech recognition engine to recognize an utterance", 
"the speech recognition engine being configured to actively listen for the utterance 
for a predetermined response time". 

In support of the obviousness rejection, the 12/19/2002 advisory action, on
page 2, argues that St. George teaches that "the time between the listening time and
the machine response time is a minimal and therefore the speech recognition 
system can process the speech within the allowed input time." However, this is 
contrary to the express teaching of St. George. 

Specifically, St. George teaches a system that delays sending a user's speech
input to a speech recognition engine for analysis until after the user has had an 
opportunity to review, edit, and/or delete the input. (E.g., see St. George's 
Abstract). St. George teaches that this opportunity to review and change speech
input before it is sent for speech recognition reduces the chance that invalid inputs 
will cause the system to advance to an erroneous state. Specifically, St. George 
teaches that the user is allowed to input speech signals, and edit or delete these 
signals for a bounded (but extendable) window of time. (St. George at col. 8, lines 
24-30). Only after the bounded window of time expires does St. George send "the 
aggregated speech sample" for speech recognition. 

Thus, even though St. George uses the words "recognition window" to
describe the bounded amount of time that the user has to input, edit, and/or delete 
speech signals, St. George does not interpret or "recognize" anything while speech 
input is being gathered. Instead, St. George at most teaches that audio signals are 
accepted during this "recognition window" and not sent for actual speech 
recognition until after the amount of time allotted to the "recognition window" has 
expired. Additionally, the appellant respectfully submits that the speech
recognition engine of St. George, after receiving any aggregated speech sample 
(i.e., after the window of time for providing, editing, and/or deleting the aggregate 
speech sample has closed), may take an infinite amount of time to recognize the 
contents of the received aggregated speech sample. This is especially the case 
since St. George does not teach or suggest any time limitation to understand (i.e., 
recognize) speech input.

For each of these reasons, the advisory action's assertion that "the time 
between the listening time and the machine response time is a minimal and 
therefore the speech recognition system can process the speech within the allowed 
input time" is contrary to the express teaching of St. George. 

Moreover, the combination of St. George in view of Salazar does not cure
the discussed deficiencies of St. George for the following reasons. 

As already noted, St. George teaches a system wherein no speech is 
recognized during the speech input response time. This is so a user will have the 
opportunity to edit or delete the speech input to avoid sending invalid speech data 
to the speech recognition engines, possibly causing an erroneous state. In light of
this, the appellant respectfully submits that the ACTION's suggested modification
to St. George's system with Salazar's system, which provides user feedback as to
whether the speech recognition system recognizes user input, would require the 
speech recognition engine to interpret speech data as it was being input. Yet, this 
suggested modification proposes exactly what St. George teaches against and 
avoids. Rather than interpreting speech input as it is received, as the modification 
inherently proposes, St. George chooses instead to allow the user to edit and/or 
delete aggregated speech input before sending it for speech recognition. This 
reduces the possibility that invalid data (e.g., unrecognizable speech) will be sent 
to the speech recognition engine. 

Accordingly, this teaching of Salazar is contrary to the express teaching of 
St. George, which goes to great lengths to avoid sending any speech input to the 
speech recognition engine until the user has had the opportunity to edit or delete 
aggregated speech input. For these reasons, a person of ordinary skill in the art at the
time of invention would not have combined Salazar with St. George to arrive at 
appellant's claimed features. 

Furthermore, even if Salazar were an appropriate reference in combination
with St. George under 35 U.S.C. 103(a), which appellant respectfully submits it is
not, the cited combination does not teach or suggest the recited features of claim 1.
Specifically, Salazar teaches using a voice file to initialize a speech recognition 
system based on the speech particularities of a particular user. Once activated, the 
speech system of Salazar is always ready to receive speech input. Thus, Salazar 
does not teach or suggest "the speech recognition engine being configured to 
actively listen for the utterance for a predetermined response time". Rather, until 
Salazar's system is either manually or verbally inactivated (i.e., an inactivate 
command spoken by the user), Salazar's speech engine may wait for just about any 
amount of time, including an infinite amount of time to recognize the user's speech
input. (E.g., see Salazar, col. 8, lines 1-3). Waiting an infinite amount of time to 
receive speech input is clearly not "a predetermined amount of time" within the 
context of claim 1 — especially since claim 1 further recites "the speech 
recognition engine being configured to enter a dormant state if the utterance is not 
recognized within the predetermined amount of time". 

Claim 1's further recited feature of "the speech recognition engine being
configured to enter a dormant state if the utterance is not recognized within the 
predetermined amount of time" was also addressed by the final office action dated 
October 10, 2001 (hereinafter referred to as the "ACTION"). Specifically,
page 5 of the ACTION admits that St. George in view of Salazar does not teach or
suggest "...sleep mode... awakened to an active mode upon detection of a starter 
utterance". Yet, the ACTION does not point out or provide any other evidence as 
to how the references teach or suggest the appellant's claimed features. Since data 
modifying the references of record in furtherance of the rejection are not specific 
or otherwise supported, the appellant respectfully requests further evidentiary
support from the office to support this rejection. 

Moreover, even in view of the lack of evidentiary support of this rejection, 
the appellant respectfully submits that nowhere do the references of record teach 
or suggest this recited feature of claim 1. For instance, as already discussed, St. 
George does not teach or suggest any speech recognition engine that is inactivated 
if it does not understand (i.e., recognize) any of the aggregated speech samples 
that are communicated to it. At most, with respect to limited amounts of time, St. 
George teaches that the user has only a specific amount of time to provide, edit, 
and/or delete speech signals — and this limitation is enforced before any speech 
signals are even communicated to the speech recognition engine. Additionally, 
once Salazar's system is activated and ready to receive speech input, Salazar's system will
wait for speech input, regardless of whether or not the speech input is recognized, 
until Salazar's system is either manually or verbally shut down (e.g., see Salazar, 
col. 8, lines 1-3). Thus, the references of record do not teach or suggest "the 
speech recognition engine being configured to enter a dormant state if the 
utterance is not recognized within the predetermined amount of time", as the 
appellant claims. 

Claim 1 further recites "a user interface to [...] display a countdown 
graphic that changes with lapsing of the predetermined response time" and "restart 
the countdown graphic in the event the speech recognition engine recognizes the 
utterance." In addressing this feature, the ACTION on page 3 concedes that 
neither St. George nor Salazar teach or suggest this recited feature. Yet, even in 
view of this admitted lack of teaching, the ACTION concludes that it would have 
been obvious to further modify St. George in view of Salazar because such a
modification would "continually grant the user maximum response time for 
generating an utterance to be recognized". The appellant respectfully disagrees. 

For the reasons already discussed, St. George explicitly teaches that no 
speech is interpreted until after the window of time for receiving verbal signals has 
expired. St. George further teaches that a user resets the window of time for 
receiving speech input by providing tactile input such as a user button or key press. 
Nowhere does St. George teach or suggest "restart the countdown graphic in the 
event the speech recognition engine recognizes the utterance", as appellant claims. 

Moreover, Salazar's teaching of visual or audio feedback in response to the 
receipt of voice input does not cure this deficiency of St. George. Although
Salazar teaches providing visual or audio indications corresponding to whether 
speech input is recognized, Salazar is completely silent with respect to providing 
any visual feedback changing with lapsing of a predetermined response time, 
wherein the visual feedback is restarted when speech input is recognized. Rather, 
since neither Salazar nor St. George teaches or suggests any requirement to receive
recognizable speech input within a predetermined amount of time as appellant 
does claim, it is highly unlikely that a person of ordinary skill in the art at the time 
of invention would have ever made such a modification to St. George, even in 
view of Salazar. 

For each of the above reasons, the references of record singly or in 
combination do not teach or suggest the features of claim 1. Accordingly, the 35 
U.S.C. 103(a) rejection of claim 1 is improper and should be overruled. 

Claims 3 and 6 depend from claim 1 and are patentably distinguished over
the references of record by virtue of this dependency. Accordingly, the 35 USC 
§103 rejection of claims 3 and 6 should be overruled. 

Claim 9 recites: 

"[...] a grammar that holds a subset of the utterances in the
vocabulary; 

a speech recognition engine to recognize the utterances in the 
grammar within a predetermined response time, the speech
recognition engine being configured to enter a dormant state if the 
utterances are not recognized within the predetermined response of 
time; and 

a user interface to display a countdown graphic that changes 
with lapsing of the response time, wherein the user interface restarts 
the countdown graphic in the event the speech recognition engine 
recognizes the one of the utterances."

For the reasons discussed above in reference to claim 1, the references of 
record, singly or in combination, do not teach or suggest the various features of 
claim 9. 

Accordingly, the 35 USC §103 rejection of claim 9 should be overruled. 

Claims 13, 14, 16, and 17 depend from claim 9 and are patentably
distinguished over the references of record by virtue of this dependency.
Accordingly, the 35 USC §103 rejection of claims 13, 14, 16, and 17 should be
overruled.

Claim 18 recites: 

"[...] a graphic progress bar shown on the display that
indicates a response time in which the speech recognition system is 
awaiting a user to speak, the progress bar shortening with passage 
of the response time, wherein the graphic progress bar is lengthened 
to its initial position after each recognized user input, wherein the 
user interface plays an audible sound when the speech recognition 
engine recognizes one of the utterances within the predetermined 
response time, and wherein the user interface indicates that the 
speech recognition engine is in a dormant state when at least one of
the utterances is not recognized within the predetermined response 
of time."

For the reasons already discussed, the references of record, singly or in 
combination, do not teach or suggest these features of claim 18. 

Accordingly, the 35 USC §103 rejection of claim 18 should be overruled. 

Claims 21 and 22 depend from claim 18 and are patentably distinguished 
over the references of record by virtue of this dependency. 

Accordingly, the 35 USC §103 rejection of claims 21 and 22 should be 
overruled. 

Claim 23 recites:

"[...] a graphic shown on the display that indicates a fixed
response time in which the speech recognition system is awaiting 
receipt of an utterance via the audio input, the graphic diminishing 
in size with the passage of time, the graphic returning to an original 
size after each recognized utterance; and 

an audio generator to emit a first audible sound when the 
speech recognition system recognizes the utterance, the audio 
generator being further configured to emit a second audible sound 
when the fixed response time has expired before the utterance has 
been recognized, the second sound indicating that the speech 
recognition system has entered a dormant state."

For the reasons already discussed, the references of record, singly or in 
combination, do not teach or suggest these features of claim 23. Accordingly, the 
35 USC §103 rejection of claim 23 should be overruled. 

Claims 24-26 depend from claim 23 and are patentably distinguished over
the references of record by virtue of this dependency. Accordingly, the 35 USC
§103 rejection of claims 24-26 should be overruled.

Claim 27 recites: 

"A vehicle computer system comprising: 
a computer; 

an open platform operating system executing on the 
computer, the operating system being configured to support 
multiple applications; and 

a speech recognition system to detect utterances used to 
control at least one of the applications running on the computer, the 
speech recognition system having a user interface to provide visual 
and auditory feedback indicating whether an utterance is
recognized, the user interface being configured to play a first 
audible sound indicating recognition of the utterance and to display 
a graphic that diminishes in size from an original size with the 
passage of time, the graphic returning to the original size after each 
recognized utterance, the user interface being further configured to 
emit a second audible sound when a predetermined response time 
has expired before the utterance has been recognized, the second 
sound indicating that the speech recognition system has entered a
dormant state."

For the reasons already discussed, the references of record, singly or in
combination, do not teach or suggest these features. Accordingly, for these
reasons alone, the 35 USC §103 rejection of claim 27 should be overruled.

Claims 28 and 31 depend from claim 27 and are patentably distinguished 
over the references of record by virtue of this dependency. Accordingly, the 35 
USC §103 rejection of claims 28 and 31 should be overruled. 

Claim 33 recites: 

"A collaboration system involving multiple interconnected 
devices comprising: 

a voice input mechanism resident at each of the devices; 

an audio output system resident at each of the devices; and 

a user interface to provide visual and auditory feedback 
indicating when a party located at one of the devices can speak, the 
user interface being configured to play an audible sound when the 
party can begin speaking and to display a graphic that changes with 
lapsing of time to indicate a duration that the party can speak, the 
graphic diminishing in size from an original size with the passage of 
time, the graphic returning to the original size after each recognized 
utterance, wherein the user interface plays an audible sound upon 
recognizing an utterance within the duration that the party can 
speak, the user interface emitting a second audible sound when the 
duration has expired before the utterance has been recognized, the 
second sound indicating that the speech recognition system has 
entered a dormant state."

For the reasons already discussed, the references of record, singly or in 
combination, do not teach or suggest these features. Accordingly, for these 
reasons alone, the 35 USC §103 rejection of claim 33 should be overruled. 



Claim 34 recites "changing the graphic to indicate passage of the response 
time such that the graphic diminishes in size from an original size with the passage 
of time", and "responsive to recognizing an utterance, presenting the graphic in the 
original size". For the reasons discussed above in reference to claim 1, the 
references of record, singly or in combination, do not teach or suggest this feature 
of claim 34. 

Additionally, claim 34 recites "responsive to expiration of the response 
time before the audible utterance has been recognized, emitting a second sound to 
indicate that the speech recognition system has entered a dormant state." For the 
reasons discussed above in reference to claim 1, the references of record, singly or 
in combination, do not teach or suggest this feature of claim 34. 

Moreover, nowhere do the references of record teach "playing a first sound 
when an audible utterance is recognized" and "emitting a second sound to indicate 
that the speech recognition system has entered a dormant state." If this feature is 
again rejected, the appellant respectfully requests that the Office point out where
this feature is taught or suggested in the references. 

Accordingly, for each of these reasons, the 35 USC §103 rejection of claim 
34 should be overruled. 

Claim 39 depends from claim 34 and is allowable over the references of 
record by virtue of this dependency. Accordingly, the 35 USC §103 rejection of 
claim 39 should be overruled. 

Claims 4-5, 7-8, 11-12, 15, 20, 29-30, 32, 35, and 37 are Not Obvious
Over St. George in View of Salazar and further in view of 
VanBuskirk in Conformity with 35 U.S.C. §103(a). 

Claims 4-5, 7-8, 11-12, 15, 20, 29-30, 32, 35, and 37 stand rejected under
35 USC § 103(a) as being unpatentable over St. George in view of Salazar as 
applied to claims 1, 9, 18, 23, 27, and 34, and further in view of VanBuskirk. The 
appellant respectfully traverses this rejection. 

Claims 4, 11, 20, 29, and 37 depend from one of claims 1, 9, 18, 27, or 34. 
For the respective reasons discussed above, in reference to claims 1, 9, 18, 27, and 
34, dependent claims 4, 11, 20, 29, and 37, by virtue of their respective
dependency on an allowable base claim, are allowable over St. George in view of 
Salazar. 

Claim 4 recites "wherein the user interface displays visual elements in a 
first color and briefly changes to a second color in the event the speech recognition 
engine recognizes the utterance." Claim 11 recites "wherein the user interface 
displays visual elements in a first color and briefly changes to a second color in 
the event the speech recognition engine recognizes one of the utterances." Claim 
20 recites "graphic progress bar briefly changes color when a user input is 
recognized." Claim 29 recites "wherein the user interface displays visual elements 
in a first color and briefly changes to a second color in the event the utterance is 
recognized." And, claim 37 recites "changing a color of the graphic when an 
audible utterance is recognized." 

In addressing these claims, the ACTION on page 4 concedes that neither St. 
George nor Salazar alone or in combination teach or suggest "...interface displays
visual elements in a first color..." Instead, the ACTION relies on VanBuskirk's 
teaching of a status bar that changes color to represent volume level of dictated
speech to conclude it would have been obvious to modify St. George in view of 
Salazar to incorporate the status bar of VanBuskirk to provide a user with an 
additional option to monitor input response time. The appellant respectfully 
disagrees with this conclusion of obviousness for the following reasons. 

As already discussed above, St. George teaches a system wherein nothing is 
recognized during the speech input response time. This is so a user will have the 
opportunity to edit or delete the speech input to avoid sending erroneous data to 
the speech recognition engine. In light of this, the appellant respectfully submits
that the ACTION's suggested modification to St. George (i.e., incorporate the
status bar of VanBuskirk to provide a user with an additional option to monitor
input response time) would require the speech recognition engine to interpret
speech data as it was being input. However, as discussed above with respect to the
combination of St. George in view of Salazar, this is exactly what St. George teaches
against and avoids, choosing instead to allow the user to edit and/or delete
aggregated speech input before it is sent for speech recognition. For these reasons,
a person of ordinary skill in the art at the time of invention would not
have made the ACTION's proposed modification to St. George in view of Salazar
and further in view of VanBuskirk.

For these reasons, the cited combination does not teach or suggest the 
features of claims 4, 11, 20, 29, and 37. Accordingly, the 35 USC §103 rejection 
of claims 4, 11, 20, 29, and 37 should be overruled.

Claims 5, 12, 30, and 35 depend from one of claims 1, 9, 18, 27, or 34.
For the respective reasons discussed above, in reference to claims 1, 9, 18, 27, and
34, dependent claims 5, 12, 30, and 35, by virtue of their respective dependency on
an allowable base claim, are allowable over St. George in view of Salazar.

Additionally, in addressing these claims, the ACTION on page 5 admits that
neither St. George nor Salazar teach or suggest "countdown bar comprises a
progress bar". Instead, the Office relies on VanBuskirk's status bar that
graphically represents change in volume level of dictated speech to conclude it
would have been obvious to modify St. George in view of Salazar to incorporate
the status bar of VanBuskirk to provide a user with an additional option to monitor
input response time. However, for the reasons already discussed above with
respect to claims 4, 11, 20, 29, and 37, a person of ordinary skill in the art at the time
of invention would not have made the ACTION's proposed
modification to St. George in view of Salazar and further in view of VanBuskirk.

Accordingly, the 35 USC §103 rejection of claims 5, 12, 30, and 35 should 
be overruled. 

Claims 7, 15, and 32 depend from one of claims 1, 9, or 27. For the 
respective reasons discussed above, in reference to claims 1, 9, and 27, dependent 
claims 7, 15, and 32, by virtue of their respective dependency on an allowable base
claim, are allowable over St. George in view of Salazar. 

In addressing these claims, the ACTION concedes that neither St. George 
nor Salazar teach or suggest "a sleep mode and is awakened to an active mode
upon detection of a starter utterance", as respectively recited by these claims. 
Instead, the ACTION relies on VanBuskirk's status bar (indicating that a system is 
not active and can be awakened with a proper voice command or by manual means) 
to conclude that the features of these claims are obvious in view of the cited 
combination. However, combining VanBuskirk's status bar with St. George in
view of Salazar does not cure the above-discussed deficiencies of St. George in 
view of Salazar. Therefore, the cited combination does not teach or suggest the 
features of claims 7, 15, and 32. 

Accordingly, the 35 USC §103 rejection of claims 7, 15, and 32 should be 
overruled. 

Claim 8 depends from claim 1 and for the reasons discussed above is 
allowable over St. George in view of Salazar by virtue of this dependency. 

In addressing this claim, the ACTION admits that St. George in view of 
Salazar does not teach or suggest the features of claim 8. Instead, the ACTION 
relies on VanBuskirk's teaching of status information to indicate that a system is in
a sleep mode that can be activated responsive to a command (or manual means) to 
conclude that the features of claim 8 are obvious. The appellant disagrees. 
VanBuskirk's status information and sleep mode that may be activated by a 
command (or manual means) in combination with St. George in view of Salazar 
does not cure the above-discussed deficiencies of St. George in view of Salazar. 
Therefore, the cited combination does not teach or suggest the features of claim 8. 

Accordingly, the 35 USC §103 rejection of claim 8 should be overruled. 

Conclusion 

The Appellant respectfully considers this application to be in condition for
allowance and respectfully requests that the Board overturn the final rejection and
that the Office pass this application to allowance.

This brief is being submitted in triplicate. 

X. Appendix A: claims on appeal

1. (Once Amended) A speech recognition system comprising:
a speech recognition engine to recognize an utterance, the speech 
recognition engine being configured to actively listen for the utterance for a 
predetermined response time, the speech recognition engine being configured to 
enter a dormant state if the utterance is not recognized within the predetermined 
amount of time, the speech recognition system remaining in the dormant state until 
recognition of a starter word that is independent of the utterance; and 

a user interface to provide visual and auditory feedback indicating whether 
the speech recognition engine recognizes the utterance, the user interface being 
configured to: (a) play an audible sound indicating recognition of the utterance; (b) 
display a countdown graphic that changes with lapsing of the predetermined 
response time; (c) restart the countdown graphic in the event the speech 
recognition engine recognizes the utterance. 

3. (Unchanged) A speech recognition system as recited in claim 1, 
wherein the response time is configurable. 

4. (Unchanged) A speech recognition system as recited in claim 1, 
wherein the user interface displays visual elements in a first color and briefly 
changes to a second color in the event the speech recognition engine recognizes 
the utterance. 

5. (Unchanged) A speech recognition system as recited in claim 1, 
wherein the countdown graphic comprises a progress bar that shortens as the 
response time diminishes. 

6. (Unchanged) A speech recognition system as recited in claim 1, 
wherein the user interface plays another audible sound when the response time has 
elapsed. 

7. (Unchanged) A speech recognition system as recited in claim 1, 
wherein the speech recognition engine is initially in a sleep mode and is awakened 
to an active mode upon detection of a starter utterance, the user interface plays 
another audible sound indicating that the speech recognition engine is in the active 
mode in the event the speech recognition engine recognizes the starter utterance. 

8. (Once Amended) A speech recognition system as recited in claim 1, 
wherein the speech recognition engine is initially in a sleep mode and is awakened 
to an active mode upon depression of a button, the user interface plays another 
audible sound indicating that the speech recognition engine is in the active mode 
in the event the speech recognition engine recognizes a starter utterance. 

9. (Twice Amended) A speech recognition system comprising: 
an application; 

a vocabulary accessible by the application, the vocabulary holding a set of 
utterances applicable to the application; 

a grammar that holds a subset of the utterances in the vocabulary; 

a speech recognition engine to recognize the utterances in the grammar 
within a predetermined response time, the speech recognition engine being 
configured to enter a dormant state if the utterances are not recognized within the 
predetermined response of time; and 

a user interface to display a countdown graphic that changes with lapsing of 
the response time, wherein the user interface restarts the countdown graphic in the 
event the speech recognition engine recognizes the one of the utterances. 

11. (Unchanged) A speech recognition system as recited in claim 9, 
wherein the user interface displays visual elements in a first color and briefly 
changes to a second color in the event the speech recognition engine recognizes 
one of the utterances. 

12. (Unchanged) A speech recognition system as recited in claim 9, 
wherein the countdown graphic comprises a progress bar that shortens as the 
response time diminishes. 



13. (Once Amended) A speech recognition system as recited in claim 9, 
wherein the user interface plays an audible sound when the speech recognition 
engine recognizes one of the utterances within the predetermined response time. 

14. (Unchanged) A speech recognition system as recited in claim 9, 
wherein the user interface plays an audible sound when the response time has 
elapsed. 

15. (Unchanged) A speech recognition system as recited in claim 9, 
wherein the speech recognition engine is initially in a sleep mode and is awakened 
to an active mode upon detection of a starter utterance, the user interface plays 
another audible sound indicating that the speech recognition engine is in the active 
mode in the event the speech recognition engine recognizes the starter utterance. 

16. (Unchanged) An entertainment system incorporating the speech 
recognition system as recited in claim 9. 

17. (Unchanged) A computing device incorporating the speech 
recognition system as recited in claim 9. 

18. (Twice Amended) A user interface for an speech recognition system, 
the user interface comprising: 
a display; and 

a graphic progress bar shown on the display that indicates a response time 
in which the speech recognition system is awaiting a user to speak, the progress 
bar shortening with passage of the response time, wherein the graphic progress bar 
is lengthened to its initial position after each recognized user input, wherein the 
user interface plays an audible sound when the speech recognition engine 
recognizes one of the utterances within the predetermined response time, and 
wherein the user interface indicates that the speech recognition engine is in a 
dormant state when at least one of the utterances is not recognized within the 
predetermined response of time. 

20. (Unchanged) A user interface as recited in claim 18, wherein the 
graphic progress bar briefly changes color when a user input is recognized. 

21. (Unchanged) A speech recognition system incorporating the user 
interface as recited in claim 18. 

22. (Unchanged) A computing device incorporating the user interface as 
recited in claim 18. 

23. (Once Amended) A user interface for an speech recognition system, 
the user interface comprising: 

a display; 

an audio input to receive audible utterances; 

a graphic shown on the display that indicates a fixed response time in 
which the speech recognition system is awaiting receipt of an utterance via the 
audio input, the graphic diminishing in size with the passage of time, the graphic 
returning to an original size after each recognized utterance; and 

an audio generator to emit a first audible sound when the speech 
recognition system recognizes the utterance, the audio generator being further 
configured to emit a second audible sound when the fixed response time has 
expired before the utterance has been recognized, the second sound indicating that 
the speech recognition system has entered a dormant state. 

24. (Unchanged) A user interface as recited in claim 23, wherein the 
audio generator emits a second audible sound when the speech recognition system 
fails to recognize the utterance within the response time. 

25. (Unchanged) A speech recognition system incorporating the user 
interface as recited in claim 23. 

26. (Unchanged) A computing device incorporating the user interface as 
recited in claim 23. 

27. (Once Amended) A vehicle computer system comprising: 
a computer; 

an open platform operating system executing on the computer, the 
operating system being configured to support multiple applications; and 

a speech recognition system to detect utterances used to control at least one 
of the applications running on the computer, the speech recognition system having 
a user interface to provide visual and auditory feedback indicating whether an 
utterance is recognized, the user interface being configured to play a first audible 
sound indicating recognition of the utterance and to display a graphic that 
diminishes in size from an original size with the passage of time, the graphic 
returning to the original size after each recognized utterance, the user interface 
being further configured to emit a second audible sound when a predetermined 
response time has expired before the utterance has been recognized, the second 
sound indicating that the speech recognition system has entered a dormant state. 

28. (Unchanged) A vehicle computer system as recited in claim 27, 
wherein the user interface restarts the graphic in the event the utterance is 
recognized. 

29. (Unchanged) A vehicle computer system as recited in claim 27, 
wherein the user interface displays visual elements in a first color and briefly 
changes to a second color in the event the utterance is recognized. 



30. (Unchanged) A vehicle computer system as recited in claim 27, 
wherein the graphic comprises a progress bar that shortens as the response time 
passes. 

31. (Unchanged) A vehicle computer system as recited in claim 27, 
wherein the user interface plays another audible sound when the response time has 
elapsed. 

32. (Unchanged) A vehicle computer system as recited in claim 27, 
wherein the speech recognition system is initially in a sleep mode and is awakened 
to an active mode upon detection of a starter utterance, the user interface plays 
another audible sound indicating that the speech recognition system is in the active 
mode in the event the starter utterance is recognized. 

33. (Twice Amended) A collaboration system involving multiple 
interconnected devices comprising: 

a voice input mechanism resident at each of the devices; 

an audio output system resident at each of the devices; and 

a user interface to provide visual and auditory feedback indicating when a 
party located at one of the devices can speak, the user interface being configured 
to play an audible sound when the party can begin speaking and to display a 
graphic that changes with lapsing of time to indicate a duration that the party can 
speak, the graphic diminishing in size from an original size with the passage of 
time, the graphic returning to the original size after each recognized utterance, 
wherein the user interface plays an audible sound upon recognizing an utterance 
within the duration that the party can speak, the user interface emitting a second 
audible sound when the duration has expired before the utterance has been 
recognized, the second sound indicating that the speech recognition system has 
entered a dormant state. 

34. (Twice Amended) A method for operating a speech recognition 
system, comprising the following steps: 

initiating a response time in which to receive an audible utterance; 

displaying a graphic representing the response time; 

playing a first sound when an audible utterance is recognized; 

changing the graphic to indicate passage of the response time such that the 
graphic diminishes in size from an original size with the passage of time; 

responsive to recognizing an utterance, presenting the graphic in the 
original size; and 

responsive to expiration of the response time before the audible utterance 
has been recognized, emitting a second sound to indicate that the speech 
recognition system has entered a dormant state. 

35. (Unchanged) A method as recited in claim 34, wherein the 
displaying and changing steps comprise the steps of depicting a progress bar and 
shortening the progress bar as the response time passes. 

37. (Unchanged) A method as recited in claim 34, further comprising the 
step of changing a color of the graphic when an audible utterance is recognized. 

39. (Unchanged) A method as recited in claim 34, further comprising the 
step of playing a sound when no audible utterance is recognized within the 
response time. 